AITopics | stochastic environment

An Exploration-by-Optimization Approach to Best of Both Worlds in Linear Bandits

Neural Information Processing SystemsApr-30-2026, 02:09:01 GMT

In this paper, we consider how to construct best-of-both-worlds linear bandit algorithms that achieve nearly optimal performance for both stochastic and adversarial environments. For this purpose, we show that a natural approach referred to as exploration by optimization [Lattimore and Szepesvári, 2020b] works well.

artificial intelligence, data mining, machine learning, (18 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.64)

Add feedback

fe90657b12193c7b52a3418bdc351807-Paper-Conference.pdf

Neural Information Processing SystemsApr-28-2026, 11:45:06 GMT

machine learning, reinforcement learning, trajectory, (18 more...)

Neural Information Processing Systems

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

0c8d3770cbb759430f4f4679abe3ab80-Paper-Conference.pdf

Neural Information Processing SystemsApr-24-2026, 13:09:55 GMT

artificial intelligence, data mining, machine learning, (19 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Data Science > Data Mining > Big Data (0.47)

Add feedback

0a2f65c9d2313b71005e600bd23393fe-Paper-Conference.pdf

Neural Information Processing SystemsApr-24-2026, 11:50:41 GMT

machine learning, rcsl, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)

Add feedback

024677efb8e4aee2eaeef17b54695bbe-Paper.pdf

Neural Information Processing SystemsApr-24-2026, 10:31:02 GMT

machine learning, reinforcement learning, safe region, (17 more...)

Neural Information Processing Systems

Country: Asia (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.69)

Add feedback

Fast Rates in Stochastic Online Convex Optimization by Exploiting the Curvature of Feasible Sets

Neural Information Processing SystemsMar-22-2026, 05:14:13 GMT

In this work, we explore online convex optimization (OCO) and introduce a new condition and analysis that provides fast rates by exploiting the curvature of feasible sets. In online linear optimization, it is known that if the average gradient of loss functions exceeds a certain threshold, the curvature of feasible sets can be exploited by the follow-the-leader (FTL) algorithm to achieve a logarithmic regret. This study reveals that algorithms adaptive to the curvature of loss functions can also leverage the curvature of feasible sets. In particular, we first prove that if an optimal decision is on the boundary of a feasible set and the gradient of an underlying loss function is non-zero, then the algorithm achieves a regret bound of $O(\rho \log T)$ in stochastic environments. Here, $\rho > 0$ is the radius of the smallest sphere that includes the optimal decision and encloses the feasible set. Our approach, unlike existing ones, can work directly with convex loss functions, exploiting the curvature of loss functions simultaneously, and can achieve the logarithmic regret only with a local property of feasible sets. Additionally, the algorithm achieves an $O(\sqrt{T})$ regret even in adversarial environments, in which FTL suffers an $\Omega(T)$ regret, and achieves an $O(\rho \log T + \sqrt{C \rho \log T})$ regret in corrupted stochastic environments with corruption level $C$.

artificial intelligence, machine learning, proceedings, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.38)

Add feedback

Combining Adversarial Guarantees and Stochastic Fast Rates in Online Learning

Neural Information Processing SystemsMar-17-2026, 11:34:03 GMT

We consider online learning algorithms that guarantee worst-case regret rates in adversarial environments (so they can be deployed safely and will perform robustly), yet adapt optimally to favorable stochastic environments (so they will perform well in a variety of settings of practical importance). We quantify the friendliness of stochastic environments by means of the well-known Bernstein (a.k.a.

artificial intelligence, machine learning, proceedings, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.42)

Add feedback

Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning

Erwan Lecarpentier, Emmanuel Rachelson

Neural Information Processing SystemsFeb-19-2026, 18:11:37 GMT

This work tackles the problem of robust planning in non-stationary stochastic environments.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > France (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.83)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.69)

Add feedback

Fast Rates in Stochastic Online Convex Optimization by Exploiting the Curvature of Feasible Sets

Neural Information Processing SystemsFeb-17-2026, 17:26:20 GMT

In this work, we explore online convex optimization (OCO) and introduce a new condition and analysis that provides fast rates by exploiting the curvature of feasible sets. In online linear optimization, it is known that if the average gradient of loss functions exceeds a certain threshold, the curvature of feasible sets can be exploited by the follow-the-leader (FTL) algorithm to achieve a logarithmic regret. This study reveals that algorithms adaptive to the curvature of loss functions can also leverage the curvature of feasible sets. In particular, we first prove that if an optimal decision is on the boundary of a feasible set and the gradient of an underlying loss function is non-zero, then the algorithm achieves a regret bound of O ( ρ ln T) in stochastic environments. Here, ρ > 0 is the radius of the smallest sphere that includes the optimal decision and encloses the feasible set. Our approach, unlike existing ones, can work directly with convex loss functions, exploiting the curvature of loss functions simultaneously, and can achieve the logarithmic regret only with a local property of feasible sets. Additionally, the algorithm achieves an O ( T) regret even in adversarial environments, in which FTL suffers an Ω( T) regret, and achieves an O ( ρ ln T + Cρ ln T) regret in corrupted stochastic environments with corruption level C .

artificial intelligence, loss function, machine learning, (18 more...)

Neural Information Processing Systems

Country: